Barra, a Modular Functional GPU Simulator for GPGPU
نویسندگان
چکیده
The use of GPUs for general-purpose applications promises huge performance returns for a small investment. However the internal design of such processors is undocumented and many details are unknown, preventing developers to optimize their code for these architectures. One solution is to use functional simulation to determine program behavior and gather statistics when counters are missing or unavailable. In this article we present a GPU functional simulator targeting GPGPU based on the UNISIM framework which takes a NVIDIA cubin file as input.
منابع مشابه
Barra, a Parallel Functional GPGPU Simulator
We present a GPU functional simulator targeting GPGPU based on the UNISIM framework which takes unaltered NVIDIA CUDA executables as input. It simulates the native instruction set of the Tesla architecture at the functional level and generates detailed execution statistics. Simulation speed is competitive with the less-accurate CUDA emulation mode thanks to optimizations which exploit the inher...
متن کاملCrystalGPU: Transparent and Efficient Utilization of GPU Power
General-purpose computing on graphics processing units (GPGPU) has recently gained considerable attention in various domains such as bioinformatics, databases and distributed computing. GPGPU is based on using the GPU as a co-processor accelerator to offload computationally-intensive tasks from the CPU. This study starts from the observation that a number of GPU features (such as overlapping co...
متن کاملFault injection on GPGPU application
Today, with the development of GPU computing techniques in terms of architectures and hardware and software support, people realized that intensive computing workload could be ported to GPU device. Applications could exploit GPUs’ characteristics for parallel computing and gain a significantly high speedup comparing to CPU architecture. However, failures are still unavoidable. People have alrea...
متن کاملA complete and efficient CUDA-sharing solution for HPC clusters
In this paper we detail the key features, architectural design, and implementation of rCUDA, an advanced framework to enable remote and transparent GPGPU acceleration in HPC clusters. rCUDA allows decoupling GPUs from nodes, forming pools of shared accelerators, which brings enhanced flexibility to cluster configurations. This opens the door to configurations with fewer accelerators than nodes,...
متن کاملTowards Multi-tenant GPGPU: Event-driven Programming Model for System-wide Scheduling on Shared GPUs
Graphics processing units (GPUs) are attractive to the generalpurpose computing (GPGPU) beyond the graphics purpose. Sharing GPUs among such GPGPU applications is a key requirement especially for cloud platforms whose resources are utilized by various cloud users. However, consolidating recent GPU applications, referred to as GPU eaters, on a GPU poses a new challenge. Such advanced application...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2009